Skip to content

test/psk: add compat vector for OpenSSL < 3.2#1068

Closed
hreinecke wants to merge 2 commits intolinux-nvme:masterfrom
hreinecke:psk-ssl-compat
Closed

test/psk: add compat vector for OpenSSL < 3.2#1068
hreinecke wants to merge 2 commits intolinux-nvme:masterfrom
hreinecke:psk-ssl-compat

Conversation

@hreinecke
Copy link
Copy Markdown
Collaborator

Older OpenSSL versions have a bug where
EVP_PKEY_CTX_add1_hkdf_info() will always overwrite the existing 'info' value, and thus calculate a different identity hash. This issue has been uncovered by the PSK testcases, and has always been present.
We have fixed this with eff0ffe ("linux: fix HKDF TLS key derivation back to OpenSSL 3.0.8"), but the PSK testcases will still fail. So add the resulting hash values for the 'compat' test, and select the correct test vector based on the OpenSSL version.

@hreinecke hreinecke force-pushed the psk-ssl-compat branch 4 times, most recently from 44e9076 to cc6ba0e Compare September 5, 2025 07:40
Newer GCC complain about 'digest' being an empty argument, which
is technically true, but in practice can only happen if version == 0,
which we already check. So add a warning to keep GCC happy.

Signed-off-by: Hannes Reinecke <[email protected]>
Older OpenSSL versions have a bug where
EVP_PKEY_CTX_add1_hkdf_info() will always overwrite the existing
'info' value, and thus calculate a different identity hash.
This issue has been uncovered by the PSK testcases, and has always
been present.
We have fixed this with eff0ffe ("linux: fix HKDF TLS key derivation
back to OpenSSL 3.0.8"), but the PSK testcases will still fail.
So add the resulting hash values for the 'compat' test, and check both
versions when testing; if either of one matches the test is good.
This avoids having to figure which of all the OpenSSL versions contain
the issue and on which it is fixed.

Signed-off-by: Hannes Reinecke <[email protected]>
@hreinecke
Copy link
Copy Markdown
Collaborator Author

What a mess. Turns out it's a waste of time trying to figure which OpenSSL versions contain the bug, and which don't. Let's check the 'compat' vector first, and if that doesn't match check the 'openssl_bug' next. And only fail the tests if both do not match; that simplifies things a lot and avoid us having to update the patch for newer OpenSSL versions.

@igaw
Copy link
Copy Markdown
Collaborator

igaw commented Sep 5, 2025

Ohhh, I just realized, that this means we have at least two different compat modes with non RFC 8446 compliance:

  • identifiers created with OpenSSL >= 3.2 (roughly)
  • identifiers created with newer version of OpenSSL

To really support the compat mode, I think both versions need to be supported.

@hreinecke
Copy link
Copy Markdown
Collaborator Author

Ohhh, I just realized, that this means we have at least two different compat modes with non RFC 8446 compliance:

* identifiers created with OpenSSL >= 3.2 (roughly)

* identifiers created with newer version of OpenSSL

To really support the compat mode, I think both versions need to be supported.

Which is what I did :-)

@igaw
Copy link
Copy Markdown
Collaborator

igaw commented Sep 5, 2025

This is just the test case. But how do we deal with nvme-cli?

@hreinecke
Copy link
Copy Markdown
Collaborator Author

This is just the test case. But how do we deal with nvme-cli?

'Compat' means 'keep the original behaviour'. In our case: keep the behaviour the previous version had. And if the previous version used OpenSSL X we should generate values based on that version. If they used OpenSSL Y, we should generate values based for that version. Both values would be different, but they would have been different with the original nvme-cli/libnvme version, too. So I don't see the need to do anything different.

@igaw
Copy link
Copy Markdown
Collaborator

igaw commented Sep 8, 2025

Currently in the code base we have two test vectors. The old one which everyone seems to use (vendors of storage arrays etc) and the SPDK implementation:

  • test_data_identity_compat: Non RFC 8446 compliant, works with OpenSSL 1.1x, OpenSSL >= 3.2
  • test_data_identity: RFC 8446 compliant works with all OpenSSL 3.x versions

With OpenSSL version between 3.0 to roughly 3.2 libnvme will create a identifier which creates a third version no one is expecting. This is of the EVP_PKEY_CTX_add1_hkdf_info bug, correct?

In this case we should update the existing code which uses several EVP_PKEY_CTX_add1_hkdf_info, replace it with one single call of this function; the same approach we have in place for RFC 8446 compliant version.

I really don't want to have different outcomes of the identifier string depending which OpenSSL is used, this is going to be nightmare to explain to users.

@hreinecke
Copy link
Copy Markdown
Collaborator Author

Neither do I. But the alternative would be to drill down for each and every OpenSSL version for each and every distribution to figure out whether the bug is present or not. @cleech tried that, but that turned out to be wrong for SUSE. And really I don't see the point here; we have been happily shipping nvme-cli with the bug, and all the 'compat' flag does is to be bug-compatible with previous versions. We can get rid of the testcase, though, if that's an issue :-)

@hreinecke
Copy link
Copy Markdown
Collaborator Author

But even if we go with your solution, we will be incompatible with some older versions, depending on the openssl version used. So the 'compat' flag will generate strings which might be incompatible with existing implementations, ie the very thing the 'compat' flag should protect against.

@igaw
Copy link
Copy Markdown
Collaborator

igaw commented Sep 9, 2025

My points is, that some version of nvme-cli will create an identifier which will not work with storage vendors implementations. My understanding is that buggy versions of EVP_PKEY_CTX_add1_hkdf_info will never generate an usable identifier.

FWIW, it would be possible to detect buggy versions EVP_PKEY_CTX_add1_hkdf_info during the meson configure step.

@mwilck
Copy link
Copy Markdown
Contributor

mwilck commented Sep 9, 2025

My ¥0.02:

If it's technically possible to calculate the keys correctly, independently of the openssl version that libnvme is linked to, we should implement that.

Then, we need to decide whether or not libnvme (upstream) needs to support peers that calculate key strings with broken openssl versions. If not, we're done here. Distros could still look for backward compatibility fixes downstream.

Otherwise, this should IMO be configured on a per-host basis in config.json. A host could have an optional attribute that tells libnvme that this host uses some legacy or broken algorithm to calculate the key strings (by naming or numbering the possible bad algorithms, this approach would be extensible for future broken peers). IMO it's important that the administrator needs to configure this explicitly in config.json. For security reasons I don't think it's a good idea to autodetect this, or to try different algorithms until we find one that is verified successfully.

AFAICS we wouldn't need changes to the CI with this approach, unless we want to add tests for those alternative algorithms.

@igaw
Copy link
Copy Markdown
Collaborator

igaw commented Sep 17, 2025

Fixed it with PR#1073

@igaw igaw closed this Sep 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants